
PATENTS 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 


In re Application of: Li How Chen and Harry Meade

Serial No.: To be assigned

Filing Date: HEREWITH

Title: NOVEL MODIFIED NUCLEIC ACID SEQUENCES AND 

METHODS FOR INCREASING mRNA LEVELS AND PROTEIN 
EXPRESSION IN CELL SYSTEMS 



Docket Number: 107.637.121-B 

BOX PATENT APPLICATION 

Assistant Commissioner for Patents 
Washington, DC 20231 



CERTIFICATION UNDER 37 CFR 1.10 

I hereby certify that the attached papers are being deposited with the United States
Postal Service as "Express Mail Post Office to Addressee" Service under 37 CFR 1.10
on October 20, 1998 and are addressed to: BOX PATENT APPLICATION, Assistant
Commissioner for Patents, Washington, D.C. 20231

EM208595045US
"Express Mail Label No."

TRANSMITTAL LETTER 

Dear Sir: 

Enclosed herewith for filing in the United States Patent and Trademark Office for the 
above-referenced application are the following documents: 




1. New United States Patent Application filed under 37 C.F.R. § 1.53(b) entitled: 



NOVEL MODIFIED NUCLEIC ACID SEQUENCES AND METHODS FOR 
INCREASING mRNA LEVELS AND PROTEIN EXPRESSION IN CELL SYSTEMS 



and naming as inventors: 
CHEN, Li How 
MEADE, Henry 

the application contains 42 pages comprising: 
20 pages of the specification; 
5 pages of the claims (26 claims, of which 10 are independent and 4 are
multiple-dependent claims);
1 page of the abstract; and 
16 sheets of informal drawings comprising Figures 1 to 13. 

2. Small Entity Statement (unexecuted). 

3. Return postcard. 

This application claims priority to U.S.S.N. 60/095,649 filed on 15 May 1998 and
U.S.S.N. 60/062,592 filed on 20 October 1997.

Please charge our deposit account, 08-0219, the required filing fee of $395.00 pursuant
to 37 C.F.R. § 1.16(a), the required excess independent claims fee of $287.00 pursuant to 37
C.F.R. § 1.16(b), and the required multiple dependent claim fee of $540.00 pursuant to 37
C.F.R. § 1.16(d). The total amount of fees to be charged to our deposit account is $1222.00.
For this purpose we enclose a duplicate copy of this document. The fees charged reflect the
small entity status of this application.



60 State Street 
Boston, MA 02109 
Tel: (617) 526-6000 
Fax: (617) 526-5000 

Date: October 20, 1998 



Respectfully submitted. 



HALE AND DORR LLP

Wayne A. Keown, Ph.D.
Registration No. 33,923
Attorney for Applicants






NOVEL MODIFIED NUCLEIC ACID SEQUENCES
AND METHODS FOR INCREASING mRNA LEVELS AND PROTEIN
EXPRESSION IN CELL SYSTEMS



BACKGROUND OF THE INVENTION 

Field of the invention 

The invention relates to heterologous gene expression. More particularly, the 
invention relates to the expression of microbial or parasitic organism genes in 
higher eukaryote cell systems. 

Summary of the related art

Recombinant production of certain heterologous gene products is often
difficult in in vitro cell culture systems or in vivo recombinant production systems.
For example, many researchers have found it difficult to express proteins derived
from bacteria, parasites and viruses in cell culture systems different from the cell from
which the protein was originally derived, and particularly in mammalian cell
culture systems. One example of a therapeutically important protein which has
been difficult to produce in mammalian cells is the malaria merozoite surface
protein (MSP-1).

Malaria is a serious health problem in tropical countries. Resistance to
existing drugs is developing fast and a vaccine is urgently needed. Of the
antigens that are expressed during the life cycle of P. falciparum, MSP-1 is the most
extensively studied and promises to be the most successful candidate for
vaccination. Individuals exposed to P. falciparum develop antibodies against
MSP-1, and studies have shown that there is a correlation between a naturally
acquired immune response to MSP-1 and reduced malaria morbidity. In a number
of studies, immunization with purified native MSP-1 or recombinant fragments of
the protein has induced at least partial protection from the parasite (Diggs et al.,
(1993) Parasitol. Today 9:300-302). Thus MSP-1 is an important target for the
development of a vaccine against P. falciparum.



MSP-1 is a 190-220 kDa glycoprotein. The C-terminal region has been the
focus of recombinant production for use as a vaccine. However, a major problem in
developing MSP-1 as a vaccine is the difficulty in obtaining recombinant proteins in
bacterial or yeast expression systems that are equivalent in immunological potency
to the affinity purified native protein (Chang et al., (1992) J. Immunol. 148:548-555)
and in large enough quantities to make vaccine production feasible.

Improved procedures for enhancing expression of sufficient quantities of 
proteins derived from parasite, bacterial and viral organisms which have previously 
been difficult to produce recombinantly would be advantageous. In particular, a 
recombinant system capable of expressing MSP-1 in sufficient quantities would be 
particularly advantageous. 



BRIEF SUMMARY OF THE INVENTION 

The present invention provides improved recombinant DNA compositions
and procedures for increasing the mRNA levels and protein expression of proteins
derived from heterologous cells, preferably those of lower organisms such as
bacteria, viruses, and parasites, which have previously been difficult to express in cell
culture systems, mammalian cell culture systems, or in transgenic mammals. The
preferred protein candidates for expression in an expression system in accordance
with the invention are those proteins having DNA coding sequences comprising
high overall AT content or AT rich regions, and/or mRNA instability motifs
and/or rare codons relative to the recombinant expression systems.

In a first aspect, the invention features a modified known nucleic acid,
preferably a gene from a bacterium, virus or parasite, capable of being expressed in a
cell system, wherein the modification comprises a reduced AT content, relative to the
unmodified sequence, and optionally further comprises elimination of at least one
or all mRNA instability motifs present in the natural gene. In certain preferred
embodiments the modification further comprises replacement of one or more
codons of the natural gene with preferred codons of the cell system.

In a second aspect, the invention provides a process for preparing a modified
nucleic acid of the invention comprising the steps of lowering the overall AT
content of the natural gene encoding the protein, and/or eliminating at least one or
all mRNA instability motifs and/or replacing one or more codons with a preferred
codon of the cell system of choice, all by replacing one or more codons in the natural
gene with codons recognizable to, and preferably with codons preferred by, the cell
system of choice and which code for the same amino acids as the replaced codon.
This aspect of the invention further includes modified nucleic acids prepared
according to the process of the invention.

In a third aspect, the invention also provides vectors comprising nucleic acids
of the invention and promoters active in the cell line or organism of choice, and
host cells transformed with nucleic acids of the invention.

In a fourth aspect, the invention provides transgenic expression vectors for
the production of transgenic lactating animals comprising nucleic acids of the
invention as well as transgenic non-human lactating animals whose germlines
comprise a nucleic acid of the invention.

In a fifth aspect, the invention provides a transgenic expression vector for
production of a transgenic lactating animal species comprising a nucleic acid of the
invention and a promoter operatively coupled to the nucleic acid which directs
mammary gland expression of the protein encoded by the nucleic acid into the milk
of the transgenic animal.

In a sixth aspect, the invention provides a DNA vaccine comprising a
modified nucleic acid according to the invention. A preferred embodiment of this
aspect of the invention comprises a fragment of a modified MSP-1 gene according to
the invention.

DESCRIPTION OF THE DRAWINGS

Fig. 1 depicts the cDNA sequence of MSP-1₄₂ modified in accordance with the
invention [SEQ ID NO 1] in which 305 nucleotide positions have been replaced to
lower AT content and eliminate mRNA instability motifs while maintaining the
same protein amino acid sequence of MSP-1₄₂. The large letters indicate nucleotide
substitutions.

Fig. 2 depicts the nucleotide coding sequence of the "wild type" or native
MSP-1₄₂ [SEQ ID NO 2].

Fig. 3a is a codon usage table for wild type MSP-1₄₂ (designated "MSP wt" in the table)
and the new modified MSP-1₄₂ gene (designated "edited MSP" in the table) and
several milk protein genes (casein genes derived from goats and mouse). The
numbers in each column indicate the actual number of times a specific codon
appears in each of the listed genes. The new MSP-1₄₂ synthetic gene was derived
from the mammary specific codon usage by first choosing GC rich codons for a given
amino acid combined with selecting the amino acids used most frequently in the
milk proteins.

Fig. 3b is a codon usage table comparing the number of times each codon appears in
both the wild type MSP-1₄₂ (designated "MSP wt" in the table) and the new modified
MSP-1₄₂ gene (designated "edited MSP" in the table) as is also shown in the table in
Fig. 3a. The table in Fig. 3b also compares the frequency with which each codon
appears in the wild type MSP-1₄₂ and the new modified MSP-1₄₂ gene to the
frequency of appearance of each codon in both E. coli genes and human genes. Thus,
if the expression system were E. coli cells, this table may be used to determine what
codons are recognized by, or preferred by, E. coli.

Fig. 4a-c depict MSP-1₄₂ constructs GTC479, GTC564, and GTC627, respectively, as
are described in the examples.



Fig. 5 panel A is a Northern analysis wherein construct GTC627 comprises the new
MSP-1₄₂ gene modified in accordance with the invention, GTC479 is the construct
comprising the native MSP-1₄₂ gene, and construct GTC469 is a negative control
DNA.

Fig. 5 panel B is a Western analysis of the eluted fractions after affinity
purification. Numbers indicate collected fractions. The results show that fractions from
GTC627, the modified MSP-1₄₂ synthetic gene construct, reacted with polyclonal
antibodies to MSP-1 and the negative control GTC479 did not.

Fig. 6 depicts the nucleic acid sequences of OT1 [SEQ ID NO 3], OT2 [SEQ ID NO 4],
MSP-8 [SEQ ID NO 5], MSP-2 [SEQ ID NO 6] and MSP1 [SEQ ID NO 7] described in the
Examples.

Fig. 7 is a schematic representation of plasmid BC574.

Fig. 8 is a schematic representation of BC620.

Fig. 9 is a schematic representation of BC670.

Fig. 10 is a representation of a Western blot of MSP in transgenic milk.

Fig. 11 is a schematic representation of the nucleotide sequence of MSP42-2 [SEQ ID
NO 8].

Fig. 12 is a schematic representation of the BC-718.

Fig. 13 is a representation of a Western blot of BC-718 expression in transgenic milk.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS



The patent and scientific literature referred to herein establishes the
knowledge that is available to those with skill in the art. The issued US patents,
allowed applications, published foreign applications, and references cited herein are
hereby incorporated by reference. Any conflicts between these references and the
present disclosure shall be resolved in favor of the present disclosure.

The invention provides modified recombinant nucleic acid sequences
(preferably DNA) and methods for increasing the mRNA levels and protein
expression of proteins which are known to be, or are likely to be, difficult to express
in cell culture systems, mammalian cell culture systems, or in transgenic animals.
The preferred "difficult" protein candidates for expression using the recombinant
techniques of the invention are those proteins derived from heterologous cells,
preferably those of lower organisms such as parasites, bacteria, and viruses, having
DNA coding sequences comprising high overall AT content or AT rich regions
and/or mRNA instability motifs and/or rare codons relative to the recombinant
expression system to be used.

In a first aspect, the invention features a modified known nucleic acid,
preferably a gene from a bacterium, virus or parasite, capable of being expressed in a
cell system, wherein the modification comprises a reduced AT content, relative to
the unmodified sequence, and optionally further comprises elimination of at least
one or all mRNA instability motifs present in the natural gene. A "cell system"
includes cell culture systems, tissue culture systems, organ culture systems and
tissues of living animals. In certain preferred embodiments the modification
further comprises replacement of one or more codons of the natural gene with
preferred codons of the cell system. Each of these features is achieved by replacing
one or more codons of the natural gene with codons recognizable to, and preferably
preferred by, the cell system that encode the same amino acid as the codon which
was replaced in the natural gene. In accordance with the invention, such "silent"
nucleotide and codon substitutions should be sufficient to achieve the goal of lowering
AT content and/or of eliminating mRNA instability motifs, and/or reducing the
number of rare codons, while maintaining, and preferably improving, the ability of
the cell system to produce mRNA and express the desired protein.

Also included in the invention are those sequences which are specifically 
homologous to the modified nucleic acids of the invention under suitable stringent 
conditions, specifically excluding the known nucleic acids from which the modified 
nucleic acids are derived. A sequence is "specifically homologous" to another 
sequence if it is sufficiently homologous to specifically hybridize to the exact 
complement of the sequence. A sequence "specifically hybridizes" to another 
sequence if it hybridizes to form Watson-Crick or Hoogsteen base pairs either in the 
body, or under conditions which approximate physiological conditions with respect 
to ionic strength, e.g., 140 mM NaCl, 5 mM MgCl2. Preferably, such specific 
hybridization is maintained under stringent conditions, e.g., 0.2X SSC at 68°C. 

In preferred embodiments, the nucleic acid of the invention is capable of 
expressing the protein in mammalian cell culture, or in a transgenic animal at a 
level which is at least 25%, and preferably 50% and even more preferably at least 
100% or more of that expressed by the natural gene in an in vitro cell culture system 
or in a transgenic animal under identical conditions (i.e. the same cell type, same 
culture conditions, same expression vector). 

As used herein, by the term "expression" is meant mRNA transcription
resulting in protein expression. Expression may be measured by a number of
techniques known in the art including using an antibody specific for the protein of
interest. By "natural gene" or "native gene" is meant the gene sequence, or
fragments thereof (including naturally occurring allelic variations), which encode
the wild type form of the protein and from which the modified nucleic acid is
derived. A "preferred codon" means a codon which is used more prevalently by the
cell system of choice. Not all codon changes described herein need be changes to a
preferred codon, so long as the codon replacement is a codon which is at least
recognized by the cell system. The term "reduced AT content" as used herein means
having a lower overall percentage of nucleotides having A (adenine) or T (thymine)
bases relative to the natural gene due to replacement of the A or T containing
nucleotide positions or A and/or T containing codons with nucleotides or codons
recognized by the cell system of choice and which do not change the amino acid
sequence of the target protein. "Heterologous" is used herein to denote genetic
material originating from a different species than that into which it has been
introduced, or a protein produced from such genetic material.
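
The definition of "reduced AT content" can be illustrated with a short calculation. The following Python sketch is not part of the original disclosure; the function name `at_content` and the two toy sequences are assumptions chosen purely for illustration and are not taken from the MSP-1 gene.

```python
def at_content(seq: str) -> float:
    """Return the fraction of positions that are A or T (U is counted as T)."""
    seq = seq.upper().replace("U", "T")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


# Hypothetical 15-nucleotide coding fragments (illustrative only).
native = "TTAAAAAATATTGAA"    # Leu-Lys-Asn-Ile-Glu using AT-rich codons
modified = "CTGAAGAACATCGAG"  # same amino acids using GC-richer codons

print(f"native AT content:   {at_content(native):.1%}")
print(f"modified AT content: {at_content(modified):.1%}")
```

A modified sequence has "reduced AT content" in the sense used here when this fraction is lower than that of the natural gene while the encoded amino acid sequence is unchanged.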

Particularly preferred cell systems of the invention include mammalian cell
culture systems such as COS cells and CHO cells, as well as transgenic animals,
particularly the mammary tissue of transgenic animals. However, the invention
also contemplates bacteria, yeast, E. coli, and viral expression systems such as
baculovirus and even plant systems.

In a second aspect, the invention provides a process for preparing a modified
nucleic acid of the invention comprising the steps of lowering the overall AT
content of the natural gene encoding the protein, and/or eliminating at least one or
all mRNA instability motifs and/or replacing one or more codons with a preferred
codon of the cell system of choice, all by replacing one or more codons in the natural
gene with codons recognizable to, and preferably with codons preferred by, the cell
system of choice and which code for the same amino acids as the replaced codon.
Standard reference works describing the general principles of recombinant DNA
technology include Watson, J.D. et al., Molecular Biology of the Gene, Volumes I and
II, the Benjamin/Cummings Publishing Company, Inc., publisher, Menlo Park, CA
(1987); Darnell, J.E. et al., Molecular Cell Biology, Scientific American Books, Inc.,
publisher, New York, NY (1986); Old, R.W., et al., Principles of Gene Manipulation:
An Introduction to Genetic Engineering, 2d edition, University of California Press,
publisher, Berkeley, CA (1981); Maniatis, T., et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Laboratory, publisher, Cold Spring Harbor, NY
(1989); and Current Protocols in Molecular Biology, Ausubel et al., Wiley Press, New
York, NY (1992). This aspect of the invention further includes modified nucleic
acids prepared according to the process of the invention.
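
The codon-replacement step of this process can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used to design the modified gene: the `PREFERRED` table below is hypothetical (a real table would be derived from the codon usage of the chosen cell system), and the names `translate` and `recode` are introduced only for this example. The standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical "preferred codon" choices for a few amino acids (assumption).
PREFERRED = {"L": "CTG", "K": "AAG", "N": "AAC", "I": "ATC", "E": "GAG"}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict) -> str:
    """Replace each codon with the preferred synonymous codon, if one is given."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        out.append(preferred.get(amino_acid, codon))  # keep codon if no preference
    new_cds = "".join(out)
    # The substitution must be silent: the encoded protein may not change.
    assert translate(new_cds) == translate(cds)
    return new_cds


print(recode("TTAAAAAATATTGAA", PREFERRED))
```

The final assertion inside `recode` expresses the requirement that every replacement codon code for the same amino acid as the codon it replaces.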
Without being limited to any theory, previous research has indicated that a
conserved AU sequence (AUUUA) from the 3' untranslated region of GM-CSF
mRNA mediates selective mRNA degradation (Shaw, G. and Kamen, R. Cell
46:659-667). The focus in the past has been on the presence of these instability motifs
in the untranslated region of a gene. The instant invention is the first to recognize
an advantage to eliminating the instability sequences in the coding region of a gene.
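
Locating AUUUA motifs within a coding region is a simple string search. The sketch below is illustrative only; the helper name `find_motifs` and the nine-nucleotide example are assumptions, and the motif is written as ATTTA because the search is performed on the DNA coding strand.

```python
def find_motifs(seq: str, motif: str = "ATTTA") -> list:
    """Return 0-based start positions of every (possibly overlapping) motif."""
    seq = seq.upper().replace("U", "T")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)  # step by 1 to catch overlaps
    return positions


# Hypothetical fragment: AAT TTA TTA encodes Asn-Leu-Leu and contains ATTTA
# spanning a codon boundary.
before = "AATTTATTA"
# Silent change of both Leu codons (TTA -> CTG) removes the motif.
after = "AATCTGCTG"

print(find_motifs(before))
print(find_motifs(after))
```

As the example suggests, a motif can straddle two codons, so eliminating it may require changing either of the codons involved.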

In a third aspect, the invention also provides vectors comprising nucleic acids
of the invention and promoters active in the cell line or organism of choice, and
host cells transformed with nucleic acids of the invention. Preferred vectors include
an origin of replication and are thus replicatable in one or more cell types. Certain
preferred vectors are expression vectors, and further comprise at least a promoter
and passive terminator, thereby allowing transcription of the recombinant
expression element in a bacterial, fungal, plant, insect or mammalian cell.

In a fourth aspect, the invention provides transgenic expression vectors for
the production of transgenic lactating animals comprising nucleic acids of the
invention as well as transgenic non-human lactating animals whose germlines
comprise a nucleic acid of the invention. Such transgenic expression vectors
comprise a promoter capable of being expressed as part of the genome of the host
transgenic animal. General principles for producing transgenic animals are known
in the art. See for example Hogan et al., Manipulating the Mouse Embryo: A
Laboratory Manual, Cold Spring Harbor Laboratory, (1986); Simons et al.,
Bio/Technology 6:179-183 (1988); Wall et al., Biol. Reprod. 32:645-651 (1985); Buhler
et al., Bio/Technology 8:140-143 (1990); Ebert et al., Bio/Technology 9:835-838 (1991);
Krimpenfort et al., Bio/Technology 9:844-847 (1991); Wall et al., J. Cell Biochem.
49:113-120 (1992). Techniques for introducing foreign DNA sequences into
mammals and their germ cells were originally developed in the mouse. See e.g.,
Gordon et al., Proc. Natl. Acad. Sci. USA 77:7380-7384 (1980); Gordon and Ruddle,
Science 214:1244-1246 (1981); Palmiter and Brinster, Cell 41:343-345 (1985); Brinster et
al., Proc. Natl. Acad. Sci. USA 82:4438-4442 (1985) and Hogan et al. (ibid.). These
techniques were subsequently adapted for use with larger animals including cows
and goats. In what was, until very recently, the most widely used procedure for the
generation of transgenic mice or livestock, several hundred linear molecules of the
DNA of interest in the form of a transgenic expression construct are injected into
one of the pronuclei of a fertilized egg. Injection of DNA into the cytoplasm of a
zygote is also widely used. Most recently, cloning of an entire transgenic cell line
capable of injection into an unfertilized egg has been achieved (KHS Campbell et al.,
Nature 380:64-66 (1996)).

In a fifth aspect, the invention provides a transgenic expression vector for
production of a transgenic lactating animal species comprising a nucleic acid of the
invention and a promoter operatively coupled to the nucleic acid which directs
mammary gland expression of the protein encoded by the nucleic acid into the milk
of the transgenic animal. The mammary gland expression system has the
advantages of high expression levels, low cost, correct processing and accessibility.
Known proteins, such as bovine and human alpha-lactalbumin, have been
produced in lactating transgenic animals by several researchers (Wright et al.,
Bio/Technology 9:830-834 (1991); Vilotte et al., Eur. J. Biochem. 186:43-48 (1989);
Hochi et al., Mol. Reprod. and Devel. 33:160-164 (1992); Soulier et al., FEBS Letters
297(1,2):13-18 (1992)) and the system has been shown to produce high levels of
protein.

Preferred promoters are active in the mammary tissue. Particularly useful are
promoters that are specifically active in genes encoding milk specific proteins such
as genes found in mammary tissue, i.e. are more active in mammary tissue than in
other tissues under physiological conditions where milk is synthesized. Most
preferred are promoters that are both specific to and efficient in mammary tissue.
Among such promoters, the casein, lactalbumin and lactoglobulin promoters are
preferred, including, but not limited to, the alpha, beta and gamma casein promoters
and the alpha-lactalbumin and beta-lactoglobulin promoters. Preferred among the
promoters are those from rodents, goats and cows. Other promoters include those
that regulate a whey acidic protein (WAP) gene.

In a preferred embodiment of the invention, a modified nucleic acid
encoding MSP-1 or fragments thereof capable of expression in a cell culture system,
mammalian cell culture system or in the milk of a transgenic animal is provided.
Nucleic acid sequences encoding the natural MSP-1 gene are modified in accordance
with the invention. First, the overall AT content is reduced by replacing codons of
the natural gene with codons recognizable to, and preferably with codons preferred
by, the cell system of choice, that encode the same amino acid but are sufficient to
lower the AT content of the modified nucleic acid as compared to the native MSP-1
gene or gene fragment. Second, mRNA instability motifs (AUUUA, Shaw and
Kamen, supra) in the native gene or gene fragment are eliminated from the coding
sequence of the gene by replacing codons of the natural gene with codons
recognizable to, and preferably preferred by, the cell system of choice that encode the
same amino acid but are sufficient to eliminate the mRNA instability motif.
Optionally, any other codon of the native gene may be replaced with a preferred
codon of the expression system of choice as described.

In a sixth aspect, the invention provides a DNA vaccine comprising a
modified nucleic acid according to the invention. In certain preferred
embodiments, the DNA vaccine comprises a vector according to the invention. The
DNA vaccine according to the invention may be in the form of a "naked" or
purified modified nucleic acid according to the invention, which may or may not be
operatively associated with a promoter. A nucleic acid is operatively associated with
a promoter if it is associated with the promoter in a manner which allows the
nucleic acid sequence to be expressed. Such DNA vaccines may be delivered
without encapsulation, or they may be delivered as part of a liposome, or as part of a
viral genome. Generally, such vaccines are delivered in an amount sufficient to
allow expression of the nucleic acid and elicit an antibody response in an animal,
including a human, which receives the DNA vaccine. Subsequent deliveries, at
least one week after the first delivery, may be used to enhance the antibody
response. Preferred delivery routes include introduction via mucosal membranes,
as well as parenteral administration.

A preferred embodiment of this aspect of the invention comprises a fragment
of a modified MSP-1 gene according to the invention. Such fragment preferably
includes from about 5% to about 100% of the overall gene sequence and comprises
one or more modifications according to the invention.

Examples of codon usage from E. coli and human are shown in Fig. 3b. Fig. 3b
shows the frequency of codon usage for the MSP-1 native gene as well as the
modified MSP-1 gene of the invention and also compares the frequency of codon
usage to that of E. coli and human genes. Codon usage frequency tables are readily
available and known to those skilled in the art for a number of other expression
systems such as yeast, baculovirus and mammalian systems.

The following examples illustrate certain preferred modes of making and 
practicing the present invention, but are not meant to limit the scope of the 
invention since alternative methods may be utilized to obtain similar results. 


Examples 

Creation of novel modified MSP-1₄₂ gene

In one embodiment, a novel modified nucleic acid encoding the C-terminal
fragment of MSP-1 is provided. The novel, modified nucleic acid of the invention
encoding a 42 kD C-terminal part of MSP-1 (MSP-1₄₂) capable of expression in
mammalian cells of the invention is shown in Fig. 1. The natural MSP-1₄₂ gene
(Fig. 2) was not capable of being expressed in mammalian cell culture or in
transgenic mice. Analysis of the natural MSP-1₄₂ gene suggested several
characteristics that distinguish it from mammalian genes. First, it has a very high
overall AT content of 76%. Second, the mRNA instability motif, AUUUA, occurred
10 times in this 1100 bp DNA segment (Fig. 2). To address these differences a new
MSP-1₄₂ gene was designed. Silent nucleotide substitutions were introduced into the
native MSP-1₄₂ gene at 306 positions to reduce the overall AT content to 49.7%. Each
of the 10 AUUUA mRNA instability motifs in the natural gene was eliminated by
changes in codon usage as well. To change the codon usage, a mammary tissue
specific codon usage table, Fig. 3a, was created by using several mouse and goat
mammary specific proteins. The table was used to guide the choice of codon usage
for the modified MSP-1₄₂ gene as described above. For example, as shown in the
table in Fig. 3a, in the natural gene, 65% (25/38) of the Leu was encoded by TTA, a
rare codon in the mammary gland. In the modified MSP-1₄₂ gene, 100% of the Leu
was encoded by CTG, a preferred codon for Leu in the mammary gland.
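
The kind of tally behind a codon usage table such as Fig. 3a, and the leucine fraction quoted above, can be sketched as follows. This is an illustrative assumption of how such counts might be computed, not the method used to prepare the figure; the names `codon_counts` and `leu_fraction` and the toy sequence are hypothetical. For reference, the quoted ratio 25/38 works out to roughly 0.66.

```python
from collections import Counter

# The six codons that encode leucine in the standard genetic code.
LEU_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}


def codon_counts(cds: str) -> Counter:
    """Count how many times each in-frame codon appears in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def leu_fraction(cds: str, codon: str) -> float:
    """Fraction of all leucine codons in `cds` that are the given codon."""
    counts = codon_counts(cds)
    total_leu = sum(counts[c] for c in LEU_CODONS)
    return counts[codon] / total_leu if total_leu else 0.0


# Hypothetical fragment: Leu(TTA) Leu(TTA) Leu(CTG) Lys(AAA)
toy = "TTATTACTGAAA"
print(codon_counts(toy))
print(f"TTA share of Leu codons: {leu_fraction(toy, 'TTA'):.2f}")
```

Running the same tally on a native and a modified gene side by side gives the per-codon columns that a usage table of this kind compares.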

An expression vector was created using the modified MSP-1₄₂ gene by fusing the
first 26 amino acids of goat beta-casein to the N-terminus of the modified MSP-1₄₂
gene, and a SalI-XhoI fragment which carries the fusion gene was subcloned into the
XhoI site of the expression vector pCDNA3. A His6 tag was fused to the 3' end of the
MSP-1₄₂ gene to allow the gene product to be affinity purified. This resulted in
plasmid GTC627 (Fig. 4c).

To compare the natural MSP-1₄₂ gene construct to the modified MSP-1₄₂ nucleic
acid of the invention, an expression vector was also created for the natural MSP-1₄₂
gene and the gene was added to mammalian cell culture and injected into mice to
form transgenic mice as follows:



Construction of the native MSP-1₄₂ Expression Vector

To secrete the truncated merozoite surface protein-1 (MSP-1) of Plasmodium
falciparum, the wild type gene encoding the 42 kD C-terminal part of MSP-1
(MSP-1₄₂) was fused to the DNA sequence that encodes either the first 15 or the first
25 amino acids of the goat beta-casein. This was achieved by first PCR amplifying the
MSP-1 plasmid (received from Dr. David Kaslow, NIH) with primers MSP1 and
MSP2 (Fig. 6), then cloning the PCR product into the TA vector (Invitrogen). The
BglII-XhoI fragment of the PCR product was ligated with oligos OT1 and OT2 (Fig.
6) into the expression vector pCDNA3. This yielded plasmid GTC564 (Fig. 4b), which
encodes the 15 amino acid beta-casein signal peptide and the first 11 amino acids of
the mature goat beta-casein followed by the native MSP-1₄₂ gene. Oligos MSP-8 and
MSP-2 (Fig. 6) were used to amplify the MSP-1 plasmid by PCR, and the product was then
cloned into the TA vector. The XhoI fragment was excised and cloned into the XhoI
site of the expression vector pCDNA3 to yield plasmid GTC479 (Fig. 4a), which
encoded the 15 amino acid goat beta-casein signal peptide fused to the wild-type MSP-1₄₂
gene. A His6 tag was added to the 3' end of the MSP-1₄₂ gene in GTC564 and GTC479.

Native MSP-1₄₂ Gene Is Not Expressed In COS-7 Cells

Expression of the native MSP gene in cultured COS-7 cells was assayed by transient
transfection assays. GTC479 and GTC564 plasmid DNA were introduced into COS-7
cells by lipofectamine (Gibco-BRL) according to the manufacturer's protocols. Total
cellular RNA was isolated from the COS cells two days post-transfection. The newly
synthesized proteins were metabolically labeled for 10 hours by adding 35S-methionine
to the culture media two days post-transfection.




To determine the MSP mRNA expression in the COS cells, a Northern blot was
probed with a 32P-labeled DNA fragment from GTC479. No MSP RNA was detected
in GTC479 or GTC564 transfectants (data not shown). Prolonged exposure revealed
residual levels of degraded MSP mRNA. The 35S-labeled culture supernatants and
the lysates were immunoprecipitated with a polyclonal antibody raised against MSP.
Immunoprecipitation experiments showed no expression from either the
lysates or the supernatants of the GTC479 or GTC564 transfected cells (data not
shown). These results showed that the native MSP-1 gene was not expressed in COS
cells.

Native MSP-1₄₂ Gene is Not Expressed in the Mammary Gland of Transgenic Mice

The SalI-XhoI fragment of GTC479, which encoded the 15 amino acids of the goat
beta-casein signal peptide, the first 11 amino acids of goat beta-casein, and the native
MSP-1₄₂ gene, was cloned into the XhoI site of the beta-casein expression vector
BC350. This yielded plasmid BC574 (Fig. 7). A SalI-NotI fragment of BC574 was
injected into the mouse embryo to generate transgenic mice. Fifteen lines of
transgenic mice were established. Milk from the female founder mice was collected
and subjected to Western analysis with polyclonal antibodies against MSP. None
of the seven mice analyzed were found to express MSP-1₄₂ protein in their milk. To
further determine if the mRNA of MSP-1₄₂ was expressed in the mammary gland,
total RNA was extracted from day 11 lactating transgenic mice and analyzed by
Northern blotting. No MSP-1₄₂ mRNA was detected in any of the BC574 lines
analyzed. Therefore, the MSP-1₄₂ transgene was not expressed in the mammary
gland of transgenic mice. Taken together, these experiments suggest that the native
parasitic MSP-1₄₂ gene could not be expressed in mammalian cells, and the block is
at the level of mRNA abundance.

Expression of MSP in the Mammalian Cells

Transient transfection experiments were performed to evaluate the expression of
the modified MSP-1₄₂ gene of the invention in COS cells. GTC627 and GTC479
DNA were introduced into the COS-7 cells. Total RNA was isolated 48 hours
post-transfection for Northern analysis. The immobilized RNA was probed with a
32P-labeled SalI-XhoI fragment of GTC627. A dramatic difference was observed
between GTC479 and GTC627. While no MSP-1₄₂ mRNA was detected in the
GTC479 transfected cells, as shown previously, abundant MSP-1₄₂ mRNA was
expressed by GTC627 (Fig. 5, Panel A). GTC469 was used as a negative control; it
comprises the insert of GTC564 cloned into pUC19, a commercially
available cloning vector. A metabolic labeling experiment with 35S-methionine
followed by immunoprecipitation with polyclonal antibody (provided by D. Kaslow,
NIAID, NIH) against MSP showed that MSP-1₄₂ protein was synthesized by the
transfected COS cells (Fig. 5, Panel B). Furthermore, MSP-1₄₂ was detected in the
transfected COS supernatant, indicating that the MSP-1₄₂ protein was also secreted.
Additionally, using a Ni-NTA column, MSP-1₄₂ was affinity purified from the
GTC627 transfected COS supernatant.

These results demonstrated that the modification of the parasitic MSP-1₄₂ gene led
to the expression of MSP mRNA in the COS cells. Consequently, the MSP-1₄₂
product was synthesized and secreted by mammalian cells.

Polyclonal antibodies used in this experiment may also be prepared by means well
known in the art (Antibodies: A Laboratory Manual, Ed Harlow and David Lane,
eds., Cold Spring Harbor Laboratory, publishers (1988)). Production of MSP serum
antibodies is also described in Chang et al., Infection and Immunity (1996) 64:253-261
and Chang et al. (1992) Proc. Natl. Acad. Sci. USA 86:6343-6347.

The results of this analysis indicate that the modified MSP-1₄₂ nucleic acid of the
invention is expressed at a very high level compared to that of the natural protein,
which was not expressed at all. These results represent the first experimental
evidence that reducing the AT % in a gene leads to expression of the MSP gene in
heterologous systems and also the first evidence that removal of AUUUA mRNA
instability motifs from the MSP coding region leads to the expression of MSP
protein in COS cells.
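
By way of illustration only, the two sequence properties discussed above, overall
AT content and the number of AUUUA instability motifs (ATTTA in the DNA coding
strand), can be computed for any coding sequence with a short script. The
following Python sketch is not part of the original disclosure; the function
names and the example fragment are hypothetical.

```python
def at_content(seq: str) -> float:
    """Return the fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def count_instability_motifs(seq: str, motif: str = "ATTTA") -> int:
    """Count occurrences of the motif, including overlapping ones."""
    seq = seq.upper()
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


if __name__ == "__main__":
    # Hypothetical AT-rich fragment, used only to exercise the functions.
    example = "AAATTATTTAATATA"
    print(f"AT content: {at_content(example):.0%}")
    print(f"ATTTA motifs: {count_instability_motifs(example)}")
```

Comparing these two numbers for a native and a modified coding sequence gives a
quick, if crude, indication of how far a redesign has moved in the direction
described here.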

Thus, the data presented here suggest that certain heterologous proteins that may be
difficult to express in cell culture or transgenic systems because of high AT content
and/or the presence of instability motifs and/or the usage of rare codons which are
unrecognizable to the cell system of choice may be reengineered to enable expression
in any given system with the aid of codon usage tables for that system. The present
invention represents the first time that a DNA sequence has been modified with the
goal of removing suspected sequences responsible for degradation resulting in low
RNA levels or no RNA at all. The results shown in the Fig. 5, Panel A Northern
(i.e., no RNA with the native gene and reasonable levels with a DNA sequence
modified in accordance with the invention) likely explain the increase in protein
production.
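
The reengineering idea described above, replacing each codon with a synonymous
codon chosen from a preference table, can be sketched in a few lines of Python.
This is a minimal illustration under stated assumptions, not the procedure used
to design the modified gene: the `PREFERRED` table below simply picks the
synonymous codon with the fewest A/T bases as a stand-in for a real codon usage
table, and the function names and example fragment are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def at_count(codon: str) -> int:
    """Number of A or T bases in a codon."""
    return sum(base in "AT" for base in codon)


# Assumption: prefer the synonymous codon with the fewest A/T bases
# (ties broken alphabetically). A real design would use a codon usage
# table for the chosen expression system instead.
PREFERRED = {}
for codon, aa in GENETIC_CODE.items():
    best = PREFERRED.get(aa)
    if best is None or (at_count(codon), codon) < (at_count(best), best):
        PREFERRED[aa] = codon


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(GENETIC_CODE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def recode(seq: str) -> str:
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(seq))


if __name__ == "__main__":
    native = "AAATTATTTAATATA"  # hypothetical AT-rich fragment (K L F N I)
    modified = recode(native)
    assert translate(native) == translate(modified)  # protein is unchanged
    print(native, "->", modified)
```

Codon-by-codon replacement does not by itself guarantee that every ATTTA motif
is removed, since a motif can span a codon boundary; a practical design would
check the recoded sequence again and adjust individual codons where needed.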

The following examples describe the expression of MSP1-42 as a native
non-fusion (and non-glycosylated) protein in the milk of transgenic mice.

Construction of MSP Transgene

To fuse MSP1-42 to the 15 amino acid beta-casein signal peptide, a pair of oligos,
MSP203 and MSP204 (MSP203:
ggccgctcgacgccaccatgaaggtcctcataattgcctgtctggtggctctggccattgcagccgtcactccctccgtcat;
MSP204:
cgatgacggagggagtgacggctgcaatggccagagccaccagacaggcaattatgaggaccttcatggtggcgtcgagc),
which encode the 15 amino acid beta-casein signal and the first 5 amino acids of
MSP1-42 ending at the Cla I site, was ligated with a Cla I-Xho I fragment of BC620
(Fig. 8), which encodes the rest of the MSP1-42 gene, into the Xho I site of the
expression vector pCDNA3. A Xho I fragment of this plasmid (GTC669) was then cloned
into the Xho I site of the milk-specific expression vector BC350 to generate BC670
(Fig. 9).
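
As a consistency check on the oligo pair, the two strands should anneal to each
other and leave single-stranded ends for ligation. The Python sketch below
tests this. It assumes the MSP203 and MSP204 sequences as read from this text
with apparent scanning errors corrected (i and l read as t); the helper names
are hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (lower case)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def anneals(top: str, bottom: str, top_5p: int, bottom_5p: int) -> bool:
    """Check that top and bottom strands pair, given 5' overhang lengths."""
    paired_top = top.lower()[top_5p:]
    paired_bottom = reverse_complement(bottom)[: len(bottom) - bottom_5p]
    return paired_top == paired_bottom


# Sequences as read from the text (assumption: OCR errors corrected).
MSP203 = (
    "ggcc"
    "gctcgacgcc" "accatgaagg" "tcctcataat" "tgcctgtctg"
    "gtggctctgg" "ccattgcagc" "cgtcactccc" "tccgtcat"
)
MSP204 = (
    "cgatgacgga" "gggagtgacg" "gctgcaatgg" "ccagagccac"
    "cagacaggca" "attatgagga" "ccttcatggt" "ggcgtcgagc"
)

if __name__ == "__main__":
    # Assumed overhangs: 4 nt at the 5' end of MSP203, 2 nt at the 5' end of MSP204.
    print("strands anneal:", anneals(MSP203, MSP204, top_5p=4, bottom_5p=2))
    print("MSP203 5' overhang:", MSP203[:4])
    print("MSP204 5' overhang:", MSP204[:2])
```

Under this reading, the 5' cg overhang on MSP204 is the kind of end that a
Cla I cut would accept, which agrees with the statement that the encoded
fragment ends at the Cla I site.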

Expression of MSP1-42 in the milk of transgenic mice

A Sal I-Not I fragment was prepared from plasmid BC670 and microinjected into
mouse embryos to generate transgenic mice. Transgenic mice were identified by
extracting mouse DNA from tail biopsies followed by PCR analysis using oligos GTC17
and MSP101 (sequences of oligos: GTC17, GATTGACAAGTAATACGCTGTTTCCTC; oligo MSP101,
GGATTCAATAGATACGG). Milk from the female founder transgenic mice was collected at
day 7 and day 9 of lactation and subjected to Western analysis to determine the
expression level of MSP1-42 using a polyclonal anti-MSP antibody and the monoclonal
anti-MSP antibody 5.2 (Dr. David Kaslow, NIH). Results indicated that the level of
MSP1-42 expression in the milk of transgenic mice was 1-2 mg/ml (Fig. 10).

Construction of MSP1-42 glycosylation site-minus mutants

Our analysis of the milk-produced MSP revealed that the transgenic MSP protein was
N-glycosylated. To eliminate the N-glycosylation sites in the MSP1-42 gene, Asn (N) at
positions 181 and 262 were substituted with Gln (Q). The substitutions were introduced by
designing DNA oligos that anneal to the corresponding region of MSP1 and carry the AAC to
CAG mutations. These oligos were then used as PCR primers to produce DNA fragments that
encode the N to Q substitutions.
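
N-linked glycosylation generally occurs at Asn residues within the sequon
Asn-X-Ser/Thr, where X is any residue other than Pro. The following Python sketch
locates such sequons and applies an N to Q substitution. It is offered as an
illustration only: the function names are hypothetical, and the two short
fragments are taken from the amino acid translation shown in the figures as read
here, so their residue numbering is not meant to match positions 181 and 262.

```python
import re


def find_sequons(protein: str) -> list:
    """Return 0-based indices of Asn residues in N-X-S/T sequons (X not Pro)."""
    # A lookahead is used so that overlapping sequons are all reported.
    return [m.start() for m in re.finditer(r"(?=N[^P][ST])", protein.upper())]


def substitute_n_to_q(protein: str, positions) -> str:
    """Replace Asn with Gln at the given 0-based positions."""
    residues = list(protein.upper())
    for pos in positions:
        if residues[pos] != "N":
            raise ValueError(f"residue at index {pos} is not Asn")
        residues[pos] = "Q"
    return "".join(residues)


if __name__ == "__main__":
    # Fragments as read from the figure translation (assumption).
    for fragment in ("KVLNYTYEKSNVEV", "GMLNISQHQ"):
        sites = find_sequons(fragment)
        mutant = substitute_n_to_q(fragment, sites)
        print(fragment, sites, "->", mutant, find_sequons(mutant))
```

At the DNA level, each such substitution corresponds to the single-codon change
described above (an Asn codon AAC replaced by the Gln codon CAG).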

To introduce the N262-Q mutation, a pair of oligos, MSPGLYCO-3
(CAGGGAATGCTGCAGATCAGG) and MSP42-2
(AATTCTGGAGTTAGTGGTGGTGGTGGTGGTGATGGCAGAAAATACCATG, Fig. 11), were used to PCR
amplify plasmid GTC627, which contains the synthetic MSP1-42 gene. The PCR product
was cloned into the pCR2.1 vector (Invitrogen). This generated plasmid GTC716.

To introduce the N181-Q mutation, oligos MSPGLYCO-1 (CTGCTTGTTGAGGAACTTGTAGGG)
and MSPGLYCO-2 (GTGGTGGAGTAGAGATATGAG, Fig. 4) were used to amplify plasmid
GTC627. The PCR product was cloned into pCR2.1. This generated plasmid GTC700.

The MSP double glycosylation mutant was constructed in the following three steps.
First, a Xho I-Bsm I fragment of BC670 and the Bsm I-Xho I fragment of GTC716 were
ligated into the Xho I site of vector pCR2.1. This resulted in a plasmid that
contains the MSP1-42 gene with the N262-Q mutation. The EcoN I-Nde I fragment of
this plasmid was then replaced by the EcoN I-Nde I fragment from plasmid GTC716 to
introduce the second mutation, N181-Q. A Xho I fragment of this plasmid was finally
cloned into BC350 to generate BC718 (Fig. 12).

Expression of nonglycosylated MSP1 in transgenic animals

BC718 has the following characteristics: it carries the MSP1-42 gene under the
control of the beta-casein promoter so it can be expressed in the mammary gland of
the transgenic animal during lactation. Further, it encodes a 15 amino acid
beta-casein leader sequence fused directly to MSP1-42, so that the MSP1-42, without
any additional amino acid at its N-terminus, can be secreted into the milk. Finally,
because of the N-Q substitutions, the MSP produced in the milk of the transgenic
animal by this construct will not be N-glycosylated. Taken together, the transgenic
MSP produced in the milk by BC718 is the same as the parasitic MSP.

A SalI/XhoI fragment was prepared from plasmid BC718 and microinjected into mouse
embryos to generate transgenic mice. Transgenic animals were identified as described
previously. Milk from female founders was collected and analyzed by Western blotting
with antibody 5.2. The results, shown in Figure 13, indicate expression of
nonglycosylated MSP1 at a concentration of 0.5 to 1 mg/ml.

Equivalents 

Those skilled in the art will recognize, or be able to ascertain, using no more than
routine experimentation, numerous equivalents, which are considered to be within the
scope of this invention and are covered by the following claims.

What is claimed is:

1. A modified known nucleic acid of a parasite which is capable of being
expressed in a mammalian cell wherein the modification comprises a reduction of
AT content of the gene by replacing one or more AT containing codons in the gene
with a preferred codon encoding the same amino acid as the replaced codon.

2. A modified known nucleic acid of a parasite protein which is capable of being
expressed in a mammalian cell wherein at least one mRNA instability motif
present in the gene coding sequence is eliminated by replacing said mRNA
instability motif with a preferred codon encoding the same amino acid as the
replaced codon.

3. The modified nucleic acid of claim 1 or 2 wherein at least one or more codons
of the known gene is replaced by a preferred milk protein specific codon encoding
the same amino acid as the replaced codon.

4. A modified known nucleic acid of a parasite which is capable of being
expressed in a mammalian cell, wherein the overall AT content of the known gene
encoding is lowered by replacement with a milk protein specific codon, and wherein
at least one mRNA instability motif present in the gene is eliminated by
replacement with a milk protein specific codon and at least one codon of the natural
gene is replaced by a preferred milk protein specific codon.

5. The modified nucleic acid of claim 4 wherein said modified nucleic acid is
capable of expressing said protein at a level which is at least 100% of that expressed
by said natural gene in an in vitro or in vivo mammalian cell system.

6. A method for preparing a modified known nucleic acid of a parasite for
expression in a mammalian cell comprising lowering the AT content of the natural 
gene by replacing one or more AT containing codons of the natural gene with a 
preferred mammary specific codon encoding the same amino acid as the replaced 
codon. 

7. A method for preparing a modified known nucleic acid of a parasite protein
for expression in a mammalian cell comprising eliminating at least one mRNA
instability motif present in the gene coding sequence by replacing one or more
mRNA instability motifs in the gene with a mammary specific codon encoding the
same amino acid as the replaced codon.

8. The method of claim 5 or 6 further comprising replacing one or more codons 
in the natural gene encoding said protein with a preferred mammary specific codon 
encoding the same amino acid as the replaced codon. 

9. A modified nucleic acid sequence prepared by the method according to claim
5 or 6.

10. A method for preparing a modified known nucleic acid of a parasite for 
expression in a mammalian cell comprising the steps of: 

a) eliminating at least one mRNA instability motif present in the natural gene 
encoding said protein by replacing one or more mRNA instability motifs in the gene 
with a preferred milk protein specific codon encoding the same amino acid as the 
replaced codon; 

b) lowering the AT rich content of the natural gene encoding said protein by
replacing one or more AT containing codons of the gene with a milk protein specific
codon encoding the same amino acid as the replaced codon; and

c) replacing one or more codons in the natural gene encoding said protein with 
a preferred mammary specific codon encoding the same amino acid as the replaced 
codon. 

11. A modified nucleic acid prepared by the method according to claim 10. 

12. A modified nucleic acid of claim 1 wherein said parasite is malaria and said 
nucleic acid is a fragment of SEQ ID NO 1 or SEQ ID NO 9 or a sequence specifically 
homologous thereto. 

13. A modified nucleic acid of claim 1 wherein said parasite is malaria and
said nucleic acid is SEQ ID NO 9 or a fragment thereof or a sequence specifically
homologous thereto.

14. A modified nucleic acid that is a fragment of SEQ ID NO 1 or a sequence 
specifically homologous thereto capable of being expressed in a cell system wherein 
the AT content of the natural gene is lowered by replacement of one or more codons 
with codons recognizable by said cell culture system coding for the same amino acid 
as the replaced codon but which effectively lower the overall AT content of the 
natural gene. 

15. A modified nucleic acid that is a fragment of SEQ ID NO 1 or a sequence 
specifically homologous thereto, capable of being expressed in a cell system wherein 
at least one mRNA instability motif present in the natural gene coding sequence is 
eliminated by replacing one or more codons comprising said instability motif with a 
codon recognizable by said cell system which effectively eliminates said instability 
motif and encodes the same amino acid as the replaced codon.

16. The modified nucleic acid of claims 14 or 15 wherein at least one or all codons
of the natural gene are replaced with preferred codons of said cell system.

17. A vector comprising the modified nucleic acid of claim 12.

18. A host cell transfected or transformed with a vector of claim 17.

19. A transgenic expression construct comprising the modified nucleic acid of
claim 12.

20. A transgenic non-human animal whose germline comprises the modified 
nucleic acid of claim 12. 

21. A transgenic expression vector for the production of a transgenic animal
comprising a promoter, operatively associated with the modified nucleic acid of 
claim 12, wherein said promoter directs mammary gland expression of the protein 
encoded by said modified nucleic acid into the animal's milk. 

22. A modified known nucleic acid of a bacterium, virus, or parasite which is
capable of being expressed in a cell system wherein the AT content of the gene is 
lowered by replacement of one or more codons with codons recognizable by said cell 
system coding for the same amino acid as the replaced codon, but which effectively 
lower the overall AT rich content of the natural gene. 

23. A modified nucleic acid of a bacterium, virus, or parasite which is capable of
being expressed in a cell system wherein at least one mRNA instability motif
present in the gene coding sequence is eliminated by replacing one or more codons
comprising said instability motif with a codon recognizable by said cell system which
effectively eliminates said instability motif and encodes the same amino acid as the
replaced codon.

24. A modified nucleic acid of claims 22 or 23, wherein at least one or all codons
of the natural gene are replaced with preferred codons of said cell system.

25. A DNA vaccine comprising a modified nucleic acid according to claim 24.

26. A DNA vaccine comprising a vector according to claim 17.

Abstract of the disclosure

The invention provides modified recombinant nucleic acid sequences (preferably
DNA) and methods for increasing the mRNA levels and protein expression of
proteins which are known to be, or are likely to be, difficult to express in cell culture
systems, mammalian cell culture systems, or in transgenic animals. The preferred
"difficult" protein candidates for expression using the recombinant techniques of
the invention are those proteins derived from heterologous cells, preferably those of
lower organisms such as parasites, bacteria, and viruses, having DNA coding
sequences comprising high overall AT content or AT rich regions and/or mRNA
instability motifs and/or rare codons relative to the recombinant expression system
to be used.

[Sequence figure: nucleotide sequence with amino acid translation. The nucleotide
lines are not legible in this copy; the amino acid translation is given below with
residue numbers.]

1> AlaValThrProSerValIleAspAsnIleLeuSerLysIleGluAsnGluTyrG
19> luValLeuTyrLeuLysProLeuAlaGlyValTyrArgSerLeuLysLysGln
37> LeuGluAsnAsnValMetThrPheAsnValAsnValLysAspIleLeuAsnSer
55> ArgPheAsnLysArgGluAsnPheLysAsnValLeuGluSerAspLeuIlePr
72> oTyrLysAspLeuThrSerSerAsnTyrValValLysAspProTyrLysPheL
90> euAsnLysGluLysArgAspLysPheLeuSerSerTyrAsnTyrIleLysAspSe
108> rIleAspThrAspIleAsnPheAlaAsnAspValLeuGlyTyrTyrLysIleLe
126> uSerGluLysTyrLysSerAspLeuAspSerIleLysLysTyrIleAsnAspLy
144> sGlnGlyGluAsnGluLysTyrLeuProPheLeuAsnAsnIleGluThrLeuTy
162> rLysThrValAsnAspLysIleAspLeuPheValIleHisLeuGluAlaLysVa
180> lLeuAsnTyrThrTyrGluLysSerAsnValGluValLysIleLysGluLeuAs
198> nTyrLeuLysThrIleGlnAspLysLeuAlaAspPheLysLysAsnAsnAsnPh
216> eValGlyIleAlaAspLeuSerThrAspTyrAsnHisAsnAsnLeuLeuThrLy
234> sPheLeuSerThrGlyMetValPheGluAsnLeuAlaLysThrValLeuSerAs
252> nLeuLeuAspGlyAsnLeuGlnGlyMetLeuAsnIleSerGlnHisGlnCysVa
270> lLysLysGlnCysProGlnAsnSerGlyCysPheArgHisLeuAspGluArgGl
288> uGluCysLysCysLeuLeuAsnTyrLysGlnGluGlyAspLysCysValGluAsn
307> ProAsnProThrCysAsnGluAsnAsnGlyGlyCysAspAlaAspAlaLysCysThrG
326> luGluAspSerGlySerAsnGlyLysLysIleThrCysGluCysThrLysProAspS
345> erTyrProLeuPheAspGlyIlePheCysSer

1 GCAGTAACTCCTTCCGTAATT...AACATACTTTCTAAAATTGAAAATGAAT...
1> AlaValThrProSerValIleAspAsnIleLeuSerLysIleGluAsnGluTyrG
EcoNI (73)
56 AGGTTTTATATTTAAAACCTTTAGCAGGTGT...
19> luValLeuTyrLeuLysProLeuAlaGlyValTyrArgSerLeuLysLysGlnLe
111 AGAAAATAACGTTATGACATTTAATGTTAATC...
37> uGluAsnAsnValMetThrPheAsnValAsnValLysAspIleLeuAsnSerArg
166 TTTAATAAACGTGAAAATTT...
56> PheAsnLysArgGluAsnPheLysAsnValLeuGluSerAspLeuIleProTyrL
221 AAGATTTAACATCAAGTAATTATGTTGTCAAA...
74> ysAspLeuThrSerSerAsnTyrValValLysAspProTyrLysPheLeuAsnLy
276 AGAAAAAAGAGATAAATTCTTAAGCAGTTATAATTATAT...
92> sGluLysArgAspLysPheLeuSerSerTyrAsnTyrIleLysAspSerIleAsp
331 ACGGATATAAATTTTGCAAATCATGTTC...
111> ThrAspIleAsnPheAlaAsnAspValLeuGlyTyrTyrLysIleLeuSerGluL
386 AATATAAATCAGATTTAGATTCAATTAAAAAATATATC...
129> ysTyrLysSerAspLeuAspSerIleLysLysTyrIleAsnAspLysGlnGlyGl
441 AAATGAGAAATACC...CCTTTTTAA...
147> uAsnGluLysTyrLeuProPheLeuAsnAsnIleGluThrLeuTyrLysThrVal
496 AATGATAAAATTCA...ATTTGTAATT...
166> AsnAspLysIleAspLeuPheValIleHisLeuGluAlaLysValLeuAsnTyrT
551 CATATGAGAAATCAAACGTAGAAGTTAAAATAAAAGAACTTAAT...
184> hrTyrGluLysSerAsnValGluValLysIleLysGluLeuAsnTyrLeuLysTh
606 aattcaagacaaattggcagattttaaaaaa...
202> rIleGlnAspLysLeuAlaAspPheLysLysAsnAsnAsnPheValGlyIleAla
661 gatttatcaacagattataaccataataacttattgacaaagt...
221> AspLeuSerThrAspTyrAsnHisAsnAsnLeuLeuThrLysPheLeuSerThrG
716 GTATGGTTTTTGAAA...TCTTGCTAAAACCGTT...
239> lyMetValPheGluAsnLeuAlaLysThrValLeuSerAsnLeuLeuAspGlyAs
771 CTTGCAAGGTATGTTAAACATTTCACAACACCAATC...
257> nLeuGlnGlyMetLeuAsnIleSerGlnHisGlnCysValLysLysGlnCysPro
826 CAAAATTCTGGATGTTTCAGACATTTAGATGAAAG...
276> GlnAsnSerGlyCysPheArgHisLeuAspGluArgGluGluCysLysCysLeuL
881 TAAATTACAAACAAGAAGGTGATAAATGTGTTGAAAATCCAAATCCTAC...
294> euAsnTyrLysGlnGluGlyAspLysCysValGluAsnProAsnProThrCysAs
936 CGAAAATAATGGTGGATGTGATGCAGATGCCAAATGTACCGAAGAAGAT...
312> nGluAsnAsnGlyGlyCysAspAlaAspAlaLysCysThrGluGluAspSerGly
991 AGCAACGGAAAGAAAATCACATGTGAATGTACTAAACCTGATT...
331> SerAsnGlyLysLysIleThrCysGluCysThrLysProAspSerTyrProLeuP
PstI (1059)
1046 TCGATGGTATTTTCTGCAGTCACCACCACCACC...
349> heAspGlyIlePheCysSerHisHisHisHisHisHis* • •

Oligos used:

OT1:
TCG ACG AGA GCC ATG AAG GTC CTC ATC CTT GCC TGT CTG GTG GCT
CTG GCC ATT GCA AGA GAG CAG GAA GAA CTC AAT GTA GTC GGT A

OT2:
GAT CTA CCG ACT ACA TTG AGT TCT TCC TGC TCT CTT GCA ATG GCC
AGA GCC ACC AGA CAG GCA AGG ATG AGG ACC TTC ATG GCT CTC G

MSP1:
AATAGATCTGCAGTAACTCCTTCCGTAATTG

MSP2:
AATTCTCGAGTTAGTGGTGGTGGTGGTGGTGACTGCAGAAATACCATC

MSP8:
TAACTCGAGCGAACCATGAAGGTCCTCATCCTTGCCTGTCTGGTGGCTCTGGCCATTGCA



Diagram of BC620, showing restriction sites XhoI (24459), ClaI (24581),
EcoNI (24829), NdeI (25113), BsmI (25341), and XhoI (25649).

FIGURE 8



Western Analysis of MSP transgenic milk. 

Lane 1, molecular weight marker; lane 2, nontransgenic mouse milk; lane 3, milk from
BC628-146 transgenic mouse; lanes 4-9, milk from BC670 transgenic mice. The blot was
reacted with monoclonal antibody 5.2 against MSP.

[Sequence figure fragments: the nucleotide lines are not legible in this copy; the
following translation lines and restriction site labels are legible.]

72> KRENFKNVLESDLIPYKDLTSSNY
EcoNI (337)
96> VVKDPYKFLNKEKRDKFLSSYNYI
NdeI (621)